Optimisation of Partitioned Temporal Joins

Author

  • Thomas Zurek
Abstract

Joins are the most expensive and performance-critical operations in relational database systems. In this thesis, we investigate processing techniques for joins that are based on a temporal intersection condition. Intuitively, such joins are used whenever one wants to match data from two or more relations that is valid at the same time. This work is divided into two parts. First, we analyse techniques that have been proposed for equi-joins. Some of them have already been adapted for temporal join processing by other authors. However, hash-based and parallel techniques, which are usually the most efficient ones in the context of equi-joins, have attracted little attention and leave several temporal-specific issues unresolved. Hash-based and parallel techniques are based on explicit symmetric partitioning. In the case of an equi-join condition, partitioning can guarantee that the relations are split into disjoint fragments; in the case of a temporal intersection condition, partitioning usually results in non-disjoint fragments, with a large number of tuples being replicated between fragments. This causes a considerable overhead for partitioned temporal join processing. This problem is an instance of the ‘min-max dilemma’: minimising the number of replicated tuples means minimising the number of fragments, and thus the degree of parallelism; conversely, increasing the number of fragments, and therefore the degree of parallelism, also increases the number of tuple replications. We analyse this problem and show that there is an algorithm of polynomial time complexity that computes an optimal solution for the interval partitioning problem (IP). This result concludes the analytical part. In the second, synthetic part of this work, we focus on the conclusions that can be drawn from the results of the first part.
We propose and develop an optimisation process that

  • analyses the temporal relations that participate in a temporal join,
  • proposes several possible partitions for these relations,
  • analyses these partitions and predicts their performance implications on the basis of a parameterised cost model, and
  • chooses the cheapest partition to process the temporal join.

We also show how this process can be efficiently implemented by using a new index structure, called the IP-table. The thesis is concluded by a thorough experimental evaluation of the optimisation process and a chapter that shows the suitability of IP-tables in a wider context of temporal query optimisation, namely using them to estimate selectivities of temporal join conditions.
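
The replication overhead described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm and does not model the IP-table; the function names (`intersects`, `partition`, `replication_overhead`, `partitioned_join`), the closed-interval representation, and the sample data are all assumptions made for illustration. The timeline is split at a list of breakpoints, every tuple is copied into each fragment its interval touches, and the join is then evaluated fragment by fragment.

```python
from bisect import bisect_right

def intersects(a, b):
    """Temporal intersection condition for closed intervals (start, end)."""
    return a[0] <= b[1] and b[0] <= a[1]

def partition(intervals, breakpoints):
    """Range-partition intervals over a timeline split at sorted breakpoints.

    Fragment i holds every interval that contains at least one time point
    belonging to that fragment, so long intervals are replicated.
    """
    fragments = [[] for _ in range(len(breakpoints) + 1)]
    for iv in intervals:
        first = bisect_right(breakpoints, iv[0])  # fragment of the start point
        last = bisect_right(breakpoints, iv[1])   # fragment of the end point
        for i in range(first, last + 1):
            fragments[i].append(iv)
    return fragments

def replication_overhead(intervals, breakpoints):
    """Number of extra tuple copies created by the partitioning."""
    copies = sum(len(f) for f in partition(intervals, breakpoints))
    return copies - len(intervals)

def partitioned_join(r, s, breakpoints):
    """Join matching fragments pairwise; a set removes duplicate results."""
    result = set()
    for fr, fs in zip(partition(r, breakpoints), partition(s, breakpoints)):
        for a in fr:
            for b in fs:
                if intersects(a, b):
                    result.add((a, b))
    return result

# Hypothetical sample relations, reduced to their validity intervals.
r = [(1, 4), (3, 12), (10, 11)]
s = [(2, 5), (6, 9), (11, 15)]

for bps in ([], [6], [4, 8, 12]):
    print(len(bps) + 1, "fragments:",
          replication_overhead(r, bps), "replicated tuples of r,",
          len(partitioned_join(r, s, bps)), "result pairs")
```

On this sample data, adding breakpoints raises the number of fragments (and hence the available parallelism) but also the number of replicated tuples, while the join result stays the same. This is the trade-off that the abstract calls the min-max dilemma.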


Similar resources

Memory-Efficient Hash Joins

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...


Using Parallelism and Pipeline for the Optimisation of Join Queries

In this study we present a technique for the parallel optimisation of join queries, that uses the offered coarse-grain parallelism of the underlying architecture in order to reduce the CPU-bound optimisation overhead. The optimisation technique performs an almost exhaustive search of the solution space for small join queries and gradually, as the number of joins increases, it diverges towards i...


The Choice of the Solution Space for Optimisation of Parallel Queries with Large Joins

The choice of the correct solution space to be searched is one of the most important factors governing the success of any query optimiser. This choice matters even more for parallel queries that consist of a large number of joins, because the size of the solution space increases. In this paper we study the effect of varying different parameters ...


An Evaluation of Non-Equijoin Algorithms

A non-equijoin of relations R and S is a band join if the join predicate requires values in the join attribute of R to fall within a specified band about the values in the join attribute of S. We propose a new algorithm, termed a partitioned band join, for evaluating band joins. We present a comparison between the partitioned band join algorithm and the classical sort-merge join algorithm (optim...


Emergency department resource optimisation for improved performance: a review

Emergency departments (EDs) have become increasingly congested due to the combined impacts of growing demand, access block and increased clinical capability of the EDs. This congestion is known to have adverse impacts on the performance of healthcare services. Attempts to overcome this challenge have focussed largely on demand management and the application of system wide p...




Publication date: 1997